Goto

Collaborating Authors

 work experience


JobHop: A Large-Scale Dataset of Career Trajectories

Johary, Iman, Romero, Raphael, Mara, Alexandru C., De Bie, Tijl

arXiv.org Artificial Intelligence

Understanding labor market dynamics is essential for policymakers, employers, and job seekers. However, comprehensive datasets that capture real-world career trajectories are scarce. In this paper, we introduce JobHop, a large-scale public dataset derived from anonymized resumes provided by VDAB, the public employment service in Flanders, Belgium. Utilizing Large Language Models (LLMs), we process unstructured resume data to extract structured career information, which is then normalized to standardized ESCO occupation codes using a multi-label classification model. This results in a rich dataset of over 1.67 million work experiences, extracted from and grouped into more than 361,000 user resumes and mapped to standardized ESCO occupation codes, offering valuable insights into real-world occupational transitions. This dataset enables diverse applications, such as analyzing labor market mobility, job stability, and the effects of career breaks on occupational transitions. It also supports career path prediction and other data-driven decision-making processes. To illustrate its potential, we explore key dataset characteristics, including job distributions, career breaks, and job transitions, demonstrating its value for advancing labor market research.


4 Best Resume Builders (2025), Tested and Reviewed

WIRED

A proper résumé builder can help you stand out from the crowd. All products featured on WIRED are independently selected by our editors. However, we may receive compensation from retailers and/or from purchases of products through these links. Landing a new job starts with a slam-dunk CV, but Word or Google Docs will only take you so far. In an era of AI-assisted Application Tracking Systems (ATS) where applicants are rejected before a human ever lays eyes on their work experience, the best résumé builders could help you stand out from the crowd. Yes, they'll give your résumé a pretty face with unique templates and attention-grabbing fonts, but résumé builders shouldn't just give you a PDF and send you on your way. A good online résumé builder will give tips and insights into how you can bolster your CV, as well as tools for managing your documents when applying to multiple jobs.


Layout-Aware Parsing Meets Efficient LLMs: A Unified, Scalable Framework for Resume Information Extraction and Evaluation

Zhu, Fanwei, Yu, Jinke, Chen, Zulong, Zhou, Ying, Ji, Junhao, Yang, Zhibo, Zhang, Yuxue, Hu, Haoyuan, Liu, Zhenghao

arXiv.org Artificial Intelligence

Automated resume information extraction is critical for scaling talent acquisition, yet its real-world deployment faces three major challenges: the extreme heterogeneity of resume layouts and content, the high cost and latency of large language models (LLMs), and the lack of standardized datasets and evaluation tools. In this work, we present a layout-aware and efficiency-optimized framework for automated extraction and evaluation that addresses all three challenges. Our system combines a fine-tuned layout parser to normalize diverse document formats, an inference-efficient LLM extractor based on parallel prompting and instruction tuning, and a robust two-stage automated evaluation framework supported by new benchmark datasets. Extensive experiments show that our framework significantly outperforms strong baselines in both accuracy and efficiency. In particular, we demonstrate that a fine-tuned compact 0.6B LLM achieves top-tier accuracy while significantly reducing inference latency and computational cost. The system is fully deployed in Alibaba's intelligent HR platform, supporting real-time applications across its business units.


MHSNet:An MoE-based Hierarchical Semantic Representation Network for Accurate Duplicate Resume Detection with Large Language Model

Li, Yu, Chen, Zulong, Xu, Wenjian, Wen, Hong, Yu, Yipeng, Yiu, Man Lung, Yin, Yuyu

arXiv.org Artificial Intelligence

To maintain the company's talent pool, recruiters need to continuously search for resumes from third-party websites (e.g., LinkedIn, Indeed). However, fetched resumes are often incomplete and inaccurate. To improve the quality of third-party resumes and enrich the company's talent pool, it is essential to conduct duplication detection between the fetched resumes and those already in the company's talent pool. Such duplication detection is challenging due to the semantic complexity, structural heterogeneity, and information incompleteness of resume texts. To this end, we propose MHSNet, an multi-level identity verification framework that fine-tunes BGE-M3 using contrastive learning. With the fine-tuned , Mixture-of-Experts (MoE) generates multi-level sparse and dense representations for resumes, enabling the computation of corresponding multi-level semantic similarities. Moreover, the state-aware Mixture-of-Experts (MoE) is employed in MHSNet to handle diverse incomplete resumes. Experimental results verify the effectiveness of MHSNet


AI Hiring with LLMs: A Context-Aware and Explainable Multi-Agent Framework for Resume Screening

Lo, Frank P. -W., Qiu, Jianing, Wang, Zeyu, Yu, Haibao, Chen, Yeming, Zhang, Gao, Lo, Benny

arXiv.org Artificial Intelligence

Resume screening is a critical yet time-intensive process in talent acquisition, requiring recruiters to analyze vast volume of job applications while remaining objective, accurate, and fair . With the advancements in Large Language Models (LLMs), their reasoning capabilities and extensive knowledge bases demonstrate new opportunities to streamline and automate recruitment workflows. In this work, we propose a multi-agent framework for resume screening using LLMs to systematically process and evaluate resumes. The framework consists of four core agents, including a resume extractor, an evaluator, a summarizer, and a score for-matter . T o enhance the contextual relevance of candidate assessments, we integrate Retrieval-Augmented Generation (RAG) within the resume evaluator, allowing incorporation of external knowledge sources, such as industry-specific expertise, professional certifications, university rankings, and company-specific hiring criteria. This dynamic adaptation enables personalized recruitment, bridging the gap between AI automation and talent acquisition. W e assess the effectiveness of our approach by comparing AI-generated scores with ratings provided by HR professionals on a dataset of anonymized online resumes.


Automating Venture Capital: Founder assessment using LLM-powered segmentation, feature engineering and automated labeling techniques

Ozince, Ekin, Ihlamur, Yiğit

arXiv.org Artificial Intelligence

This study explores the application of large language models (LLMs) in venture capital (VC) decision-making, focusing on predicting startup success based on founder characteristics. We utilize LLM prompting techniques, like chain-of-thought, to generate features from limited data, then extract insights through statistics and machine learning. Our results reveal potential relationships between certain founder characteristics and success, as well as demonstrate the effectiveness of these characteristics in prediction. This framework for integrating ML techniques and LLMs has vast potential for improving startup success prediction, with important implications for VC firms seeking to optimize their investment strategies.


The Silicon Ceiling: Auditing GPT's Race and Gender Biases in Hiring

Armstrong, Lena, Liu, Abbey, MacNeil, Stephen, Metaxa, Danaë

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly being introduced in workplace settings, with the goals of improving efficiency and fairness. However, concerns have arisen regarding these models' potential to reflect or exacerbate social biases and stereotypes. This study explores the potential impact of LLMs on hiring practices. To do so, we conduct an algorithm audit of race and gender biases in one commonly-used LLM, OpenAI's GPT-3.5, taking inspiration from the history of traditional offline resume audits. We conduct two studies using names with varied race and gender connotations: resume assessment (Study 1) and resume generation (Study 2). In Study 1, we ask GPT to score resumes with 32 different names (4 names for each combination of the 2 gender and 4 racial groups) and two anonymous options across 10 occupations and 3 evaluation tasks (overall rating, willingness to interview, and hireability). We find that the model reflects some biases based on stereotypes. In Study 2, we prompt GPT to create resumes (10 for each name) for fictitious job candidates. When generating resumes, GPT reveals underlying biases; women's resumes had occupations with less experience, while Asian and Hispanic resumes had immigrant markers, such as non-native English and non-U.S. education and work experiences. Our findings contribute to a growing body of literature on LLM biases, in particular when used in workplace contexts.


Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations

Wu, Likang, Qiu, Zhaopeng, Zheng, Zhi, Zhu, Hengshu, Chen, Enhong

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have revolutionized natural language processing tasks, demonstrating their exceptional capabilities in various domains. However, their potential for behavior graph understanding in job recommendations remains largely unexplored. This paper focuses on unveiling the capability of large language models in understanding behavior graphs and leveraging this understanding to enhance recommendations in online recruitment, including the promotion of out-of-distribution (OOD) application. We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs and uncover underlying patterns and relationships. Specifically, we propose a meta-path prompt constructor that leverages LLM recommender to understand behavior graphs for the first time and design a corresponding path augmentation module to alleviate the prompt bias introduced by path-based sequence input. By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users. We evaluate the effectiveness of our approach on a comprehensive dataset and demonstrate its ability to improve the relevance and quality of recommended quality. This research not only sheds light on the untapped potential of large language models but also provides valuable insights for developing advanced recommendation systems in the recruitment market. The findings contribute to the growing field of natural language processing and offer practical implications for enhancing job search experiences. We release the code at https://github.com/WLiK/GLRec.


Explainable Product Classification for Customs

Lee, Eunji, Kim, Sihyeon, Kim, Sundong, Jung, Soyeon, Kim, Heeja, Cha, Meeyoung

arXiv.org Artificial Intelligence

The task of assigning internationally accepted commodity codes (aka HS codes) to traded goods is a critical function of customs offices. Like court decisions made by judges, this task follows the doctrine of precedent and can be nontrivial even for experienced officers. Together with the Korea Customs Service (KCS), we propose a first-ever explainable decision supporting model that suggests the most likely subheadings (i.e., the first six digits) of the HS code. The model also provides reasoning for its suggestion in the form of a document that is interpretable by customs officers. We evaluated the model using 5,000 cases that recently received a classification request. The results showed that the top-3 suggestions made by our model had an accuracy of 93.9\% when classifying 925 challenging subheadings. A user study with 32 customs experts further confirmed that our algorithmic suggestions accompanied by explainable reasonings, can substantially reduce the time and effort taken by customs officers for classification reviews.


Career Path Prediction using Resume Representation Learning and Skill-based Matching

Decorte, Jens-Joris, Van Hautte, Jeroen, Deleu, Johannes, Develder, Chris, Demeester, Thomas

arXiv.org Artificial Intelligence

The impact of person-job fit on job satisfaction and performance is widely acknowledged, which highlights the importance of providing workers with next steps at the right time in their career. This task of predicting the next step in a career is known as career path prediction, and has diverse applications such as turnover prevention and internal job mobility. Existing methods to career path prediction rely on large amounts of private career history data to model the interactions between job titles and companies. We propose leveraging the unexplored textual descriptions that are part of work experience sections in resumes. We introduce a structured dataset of 2,164 anonymized career histories, annotated with ESCO occupation labels. Based on this dataset, we present a novel representation learning approach, CareerBERT, specifically designed for work history data. We develop a skill-based model and a text-based model for career path prediction, which achieve 35.24% and 39.61% recall@10 respectively on our dataset. Finally, we show that both approaches are complementary as a hybrid approach achieves the strongest result with 43.01%